Why is unsupervised alignment of English embeddings from different algorithms so hard?
This paper presents a challenge to the community: generative adversarial
networks (GANs) can perfectly align independent English word embeddings induced
using the same algorithm, based on distributional information alone, but fail
to do so for embeddings induced by two different algorithms. Why is that? We
believe understanding why is key to understanding both modern word embedding
algorithms and the limitations and instability dynamics of GANs. This paper
shows that (a) in all cases where alignment fails, there exists a linear
transform between the two embeddings (so algorithm biases do not lead to
non-linear differences), and (b) similar effects cannot easily be obtained by
varying hyper-parameters. One plausible suggestion based on our initial
experiments is that the differences in the inductive biases of the embedding
algorithms lead to an optimization landscape that is riddled with local optima,
leaving a very small basin of convergence; but we present this more as a
challenge paper than a technical contribution.
Comment: Accepted at EMNLP 2018
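Claim (a), that a linear transform exists between the two embedding spaces, can be checked with an ordinary least-squares fit over a shared vocabulary. A minimal sketch on synthetic data (the matrices `X` and `Y` and the hidden map `Q` are illustrative stand-ins, not the paper's actual embeddings or method):

```python
import numpy as np

def fit_linear_map(X, Y):
    """Least-squares linear map W such that X @ W approximates Y."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 50))       # "embeddings" for 1000 shared words
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))  # a hidden linear transform
Y = X @ Q                                  # second space: a linear image of the first

W = fit_linear_map(X, Y)
residual = np.linalg.norm(X @ W - Y) / np.linalg.norm(Y)
```

A near-zero relative residual indicates a linear transform suffices; a large one would point to genuinely non-linear differences between the spaces.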
Code-Switching as Strategically Employed in Political Discourse
There is extensive scholarship in the field of sociolinguistics on mediated political discourse as strategically employed to gain support in the run-up to and during elections. Among other things, this work reveals that the rhetorical success of politicians greatly depends on their ability to strike the right balance between the expression of authority and solidarity in their speech performances. The use of code-switching in achieving such balance has been touched upon in some case studies but never studied in depth. I analyse the speech of Boyko Borisov, now Prime Minister of Bulgaria (and at the time of recording, a candidate for the position), in the framework of Bell's (1984) audience and referee design theory, with reference to Myers Scotton and Ury's (1977) views on code-switching. Borisov is found to employ two codes, a standard and a nonstandard one, characteristic of two different personae of his: the authoritative politician and the folksy, regular person. Depending on the situation, he chooses to act out either just one of these personae or both of them by switching between the two codes, thus maintaining the aforementioned vital balance between the expression of power and solidarity. The analysis reveals that the switches occur at specific points in the conversation, in line with existing theory on metaphorical code-switching, confirming that they are strategic in nature rather than random or accidental.
Copenhagen at CoNLL–SIGMORPHON 2018: Multilingual Inflection in Context with Explicit Morphosyntactic Decoding
This paper documents the Team Copenhagen system, which placed first in the
CoNLL–SIGMORPHON 2018 shared task on universal morphological reinflection,
Task 2, with an overall accuracy of 49.87. Task 2 focuses on morphological
inflection in context: generating an inflected word form, given the lemma of
the word and the context it occurs in. Previous SIGMORPHON shared tasks have
focused on context-agnostic inflection; the "inflection in context" task was
introduced this year. We approach it with an encoder-decoder architecture
over character sequences, with three core innovations, all contributing to an
improvement in performance: (1) a wide context window; (2) a multi-task
learning approach with the auxiliary task of MSD prediction; (3) training
models in a multilingual fashion.
A Probabilistic Generative Model of Linguistic Typology
In the principles-and-parameters framework, the structural features of
languages depend on parameters that may be toggled on or off, with a single
parameter often dictating the status of multiple features. The implied
covariance between features inspires our probabilisation of this line of
linguistic inquiry: we develop a generative model of language based on
exponential-family matrix factorisation. By modelling all languages and
features within the same architecture, we show how structural similarities
between languages can be exploited to predict typological features with
near-perfect accuracy, outperforming several baselines on the task of
predicting held-out features. Furthermore, we show that language embeddings
pre-trained on monolingual text allow for generalisation to unobserved
languages. This finding has clear practical and also theoretical implications:
the results confirm what linguists have hypothesised, i.e., that there are
significant correlations between typological features and languages.
Comment: NAACL 2019, 12 pages
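The general technique of exponential-family matrix factorisation can be illustrated for binary typological features as logistic (Bernoulli) factorisation: languages and features each get low-dimensional embeddings, and held-out feature values are predicted from their inner products. A toy sketch on synthetic data (the dimensions, initialisation, learning rate, and simple full-batch gradient descent are illustrative assumptions, not the paper's model or optimiser):

```python
import numpy as np

rng = np.random.default_rng(1)
n_lang, n_feat, k = 30, 20, 4  # languages, binary features, latent dimension

# Synthetic binary language-feature matrix drawn from a low-rank logistic model
U_true = rng.standard_normal((n_lang, k))
V_true = rng.standard_normal((n_feat, k))
F = (rng.random((n_lang, n_feat)) < 1 / (1 + np.exp(-U_true @ V_true.T))).astype(float)
mask = rng.random((n_lang, n_feat)) < 0.8  # ~80% of entries observed, rest held out

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def masked_nll(U, V):
    """Mean negative log-likelihood over the observed entries only."""
    p = np.clip(sigmoid(U @ V.T), 1e-9, 1 - 1e-9)
    return -(mask * (F * np.log(p) + (1 - F) * np.log(1 - p))).sum() / mask.sum()

# Fit language embeddings U and feature embeddings V by gradient descent
U = 0.1 * rng.standard_normal((n_lang, k))
V = 0.1 * rng.standard_normal((n_feat, k))
loss_before = masked_nll(U, V)
lr = 0.05
for _ in range(2000):
    grad = (sigmoid(U @ V.T) - F) * mask  # gradient of the masked logistic loss
    U, V = U - lr * grad @ V / n_feat, V - lr * grad.T @ U / n_lang
loss_after = masked_nll(U, V)

# Predict the held-out feature values from the learned embeddings
heldout_acc = ((sigmoid(U @ V.T) > 0.5) == F)[~mask].mean()
```

The shared embedding space is what lets observed features of one language inform predictions for another; other feature types would swap the Bernoulli likelihood for another exponential-family member.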
Indicatements that character language models learn English morpho-syntactic units and regularities
Character language models have access to surface morphological patterns, but
it is not clear whether or how they learn abstract morphological regularities.
We instrument a character language model with several probes, finding that it
can develop a specific unit to identify word boundaries and, by extension,
morpheme boundaries, which allows it to capture linguistic properties and
regularities of these units. Our language model proves surprisingly good at
identifying the selectional restrictions of English derivational morphemes, a
task that requires both morphological and syntactic awareness. Thus we conclude
that, when morphemes overlap extensively with the words of a language, a
character language model can perform morphological abstraction.